The funprog package provides higher-order functions for data manipulation (grouping, sorting, …). It takes its inspiration from pure functional programming languages (hence the name).

## Installation

You can install funprog from CRAN with :

install.packages("funprog")

To get the development version, you can install it from GitLab :

# install.packages("remotes")
remotes::install_gitlab("py_b/funprog")

## Main functions

Currently, the main functions are all inspired by Haskell functions.

### group_if

group_if splits a vector or a list into groups, given a predicate function.

The predicate is a binary function returning a boolean, applied to every pair of adjacent elements. If it evaluates to TRUE, the two elements belong to the same group, otherwise they belong to different groups.

Equality is a frequent predicate, so group_eq is provided as a shortcut, defined as group_if(x, `==`) for atomic vectors and group_if(x, identical) for other types.
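
The adjacent-pair rule described above can be sketched in base R. This is only an illustration of the semantics, not funprog's actual implementation; the helper name group_adjacent is hypothetical.

```r
# Hypothetical helper illustrating the grouping rule (not part of funprog)
group_adjacent <- function(x, predicate) {
  n <- length(x)
  if (n == 0) return(list())
  # same_group[i] is TRUE when x[[i]] and x[[i + 1]] satisfy the predicate
  same_group <- vapply(
    seq_len(n - 1),
    function(i) isTRUE(predicate(x[[i]], x[[i + 1]])),
    logical(1)
  )
  # a new group starts at the first element and after every FALSE
  group_id <- cumsum(c(TRUE, !same_group))
  unname(split(x, group_id))
}

v <- c(2, 4, 2, 2, 1, 1, 1, 3)
groups_eq <- group_adjacent(v, `==`)  # 5 groups: 2 | 4 | 2 2 | 1 1 1 | 3
groups_le <- group_adjacent(v, `<=`)  # 3 groups: 2 4 | 2 2 | 1 1 1 3
```

Each element is compared only with its immediate neighbour, which is why `<=` yields non-decreasing runs.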

library(funprog)

x1 <- c(2, 4, 2, 2, 1, 1, 1, 3)

str(group_eq(x1)) # shortcut for group_if(x1, `==`)
#> List of 5
#>  $ : num 2
#>  $ : num 4
#>  $ : num [1:2] 2 2
#>  $ : num [1:3] 1 1 1
#>  $ : num 3

str(group_if(x1, `<=`)) # returns non decreasing sequences
#> List of 3
#>  $ : num [1:2] 2 4
#>  $ : num [1:2] 2 2
#>  $ : num [1:4] 1 1 1 3

str(group_if(x1, function(x, y) abs(x - y) > 1))
#> List of 5
#>  $ : num [1:3] 2 4 2
#>  $ : num 2
#>  $ : num 1
#>  $ : num 1
#>  $ : num [1:2] 1 3

group_if is inspired by Haskell’s groupBy function.

### %on%

%on% is a convenient operator that combines a binary function and a unary function to create a binary function. Formally, (f %on% g)(x, y) is defined as f(g(x), g(y)). For instance :

(max %on% abs)(-2, 1) # max(abs(-2), abs(1))
#> [1] 2

It may be helpful to create easy-to-read predicates, particularly in conjunction with group_if :

x2 <- c(2, 4, 2, -2, -1, 1, 1, 3)
str(group_if(x2, `==` %on% abs))
#> List of 5
#>  $ : num 2
#>  $ : num 4
#>  $ : num [1:2] 2 -2
#>  $ : num [1:3] -1 1 1
#>  $ : num 3
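
The formal definition of %on%, f(g(x), g(y)), is short enough to sketch in base R. The operator name %on_sketch% below is hypothetical and only stands in for the packaged %on%, whose implementation may differ.

```r
# Hypothetical stand-in for %on%, written in base R (not funprog's code)
`%on_sketch%` <- function(f, g) {
  # returns a binary function applying g to both arguments before f
  function(x, y) f(g(x), g(y))
}

same_abs <- `==` %on_sketch% abs
same_abs(-2, 2)                  # TRUE, since abs(-2) == abs(2)
(max %on_sketch% abs)(-2, 1)     # 2, i.e. max(abs(-2), abs(1))
```

Note that `==` must be written with backticks so that R treats it as a function name rather than as an infix operator.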

x3 <- list(1:3, 1:3, 3:5, 1, 2)
str(group_if(x3, `==` %on% length))
#> List of 2
#>  $ :List of 3
#>   ..$ : int [1:3] 1 2 3
#>   ..$ : int [1:3] 1 2 3
#>   ..$ : int [1:3] 3 4 5
#>  $ :List of 2
#>   ..$ : num 1
#>   ..$ : num 2

%on% is inspired by Haskell’s on function.

### sort_by

sort_by sorts a vector or a list, not on the values themselves, but on a transformation of those values.

sort_by(-3:2, abs)
#> [1]  0 -1  1 -2  2 -3

Additional functions can be used for breaking ties, as well as decreasing order.

str(sort_by(list(1:2, 3:4, 5), length, descending(sum)))
#> List of 3
#>  $ : num 5
#>  $ : int [1:2] 3 4
#>  $ : int [1:2] 1 2
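
For readers who want to see the idea behind sorting on transformed values with tie-breakers, here is a minimal base R sketch built on order(). The helper name sort_by_keys is hypothetical, the sketch assumes every key function returns a single number, and decreasing order is emulated by negating the key instead of using funprog's descending().

```r
# Hypothetical helper illustrating the idea (not funprog's implementation)
sort_by_keys <- function(x, ...) {
  key_funs <- list(...)
  # one numeric key vector per function, each of length(x)
  keys <- lapply(key_funs, function(f) vapply(x, f, numeric(1)))
  # order() uses later keys to break ties in earlier ones
  x[do.call(order, unname(keys))]
}

by_abs <- sort_by_keys(-3:2, abs)
# 0 -1 1 -2 2 -3

by_len_then_sum <- sort_by_keys(
  list(1:2, 3:4, 5),
  length,
  function(v) -sum(v)  # negated key: larger sums come first among ties
)
# list(5, 3:4, 1:2)
```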

sort_by is inspired by Haskell’s sortBy function.

## A more elaborate example

sort_by and group_if can work together well to perform grouping operations. The next example row-binds data.frames sharing the same column names :

library(dplyr)

dfs <- list(
  data.frame(A = 0:1, B = c("a", "b")),
  data.frame(C = 3, D = 4, E = 5),
  data.frame(A = 1),
  data.frame(A = 3, B = "c"),
  data.frame(C = 5:6, D = 7:8, E = 10:11)
)

dfs %>%
  # sort_by to make data.frames sharing same names adjacent
  sort_by(function(x) paste(names(x), collapse = "#")) %>%
  # group_if on identical column names
  group_if(identical %on% names) %>%
  # concatenate data.frames belonging to the same group
  lapply(bind_rows)
#> [[1]]
#>   A
#> 1 1
#>
#> [[2]]
#>   A B
#> 1 0 a
#> 2 1 b
#> 3 3 c
#>
#> [[3]]
#>   C D  E
#> 1 3 4  5
#> 2 5 7 10
#> 3 6 8 11

## Short syntax with purrr

If you have installed the purrr package, you can use the shortcut syntax for specifying arguments that are functions :

x1 <- c(2, 4, 2, 2, 1, 1, 1, 3)
str(group_if(x1, ~ abs(.x - .y) > 1))
#> List of 5
#>  $ : num [1:3] 2 4 2
#>  $ : num 2
#>  $ : num 1
#>  $ : num 1
#>  $ : num [1:2] 1 3

x4 <- list(c(a = 1, b = 2, c = 3), c(a = 10, b = 20))
sort_by(x4, "b") # shortcut for function(x) x[["b"]]
#> [[1]]
#> a b c
#> 1 2 3
#>
#> [[2]]
#>  a  b
#> 10 20
sort_by(x4, descending(1)) # shortcut for descending(function(x) x[[1]])
#> [[1]]
#>  a  b
#> 10 20
#>
#> [[2]]
#> a b c
#> 1 2 3